extrinsic reward
Discovering Creative Behaviors through DUPLEX: Diverse Universal Features for Policy Exploration
The ability to approach the same problem from different angles is a cornerstone of human intelligence that leads to robust solutions and effective adaptation to problem variations. In contrast, current reinforcement learning (RL) methods tend to produce policies that settle on a single solution to a given problem, making them brittle to problem variations. Replicating human flexibility in RL agents is the challenge that we explore in this work.
Diverse Mini-Batch Selection in Reinforcement Learning for Efficient Chemical Exploration in de novo Drug Design
Hampus Gummesson Svensson, Ola Engkvist, Jon Paul Janet, Christian Tyrchan, Morteza Haghir Chehreghani
In many real-world applications, proposing new instances is cheap, whereas evaluating their quality (e.g., through human feedback or physics simulations) is costly and time-consuming. This cost is even more critical in reinforcement learning, which relies on interactions with the environment (i.e., new instances) that must be evaluated to provide a reward signal for learning. At the same time, sufficient exploration is crucial in reinforcement learning for finding high-reward solutions: the agent should observe and learn from a diverse set of experiences to find different solutions. Thus, we argue that learning from a diverse mini-batch of experiences can have a large impact on exploration and help mitigate mode collapse. In this paper, we introduce mini-batch diversification for reinforcement learning and study this framework in the context of a real-world problem, namely drug discovery. We extensively evaluate how the proposed framework can enhance the effectiveness of chemical exploration in de novo drug design, where finding diverse and high-quality solutions is crucial. Our experiments demonstrate that the proposed diverse mini-batch selection framework can substantially increase the diversity of solutions while maintaining their quality. In drug discovery, such an outcome could help address unmet medical needs faster.
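To make the idea of a diverse mini-batch concrete, here is a minimal Python sketch. The abstract does not state which selection rule the authors use, so this is an illustration under stated assumptions and not their method: it substitutes a simple greedy max-min distance heuristic, and the function name `select_diverse_minibatch`, the `distance` callback, and the toy data are all hypothetical.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def select_diverse_minibatch(
    candidates: Sequence[T],
    scores: Sequence[float],
    k: int,
    distance: Callable[[T, T], float],
) -> List[int]:
    """Return indices of up to k candidates chosen greedily for diversity.

    Assumption (not from the source): diversity is approximated by a
    farthest-point heuristic. The first pick is the highest-scoring
    candidate; each later pick maximizes its distance to the nearest
    already-selected candidate, with ties broken by score.
    """
    if len(candidates) != len(scores):
        raise ValueError("candidates and scores must have the same length")
    n = len(candidates)
    if k <= 0 or n == 0:
        return []
    k = min(k, n)

    # Seed the batch with the highest-scoring candidate.
    first = max(range(n), key=lambda i: scores[i])
    selected = [first]

    # min_dist[i] is the distance from candidate i to its nearest selected item.
    min_dist = [distance(candidates[i], candidates[first]) for i in range(n)]

    while len(selected) < k:
        chosen = set(selected)
        remaining = [i for i in range(n) if i not in chosen]
        # Pick the candidate farthest from the current batch.
        nxt = max(remaining, key=lambda i: (min_dist[i], scores[i]))
        selected.append(nxt)
        # Update nearest-selected distances with the new batch member.
        for i in range(n):
            min_dist[i] = min(min_dist[i], distance(candidates[i], candidates[nxt]))

    return selected


# Toy usage: 1-D "experiences" forming three clusters, with made-up scores.
points = [0.0, 0.1, 0.2, 5.0, 5.1, 10.0]
rewards = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
batch = select_diverse_minibatch(points, rewards, 3, lambda a, b: abs(a - b))
print(batch)  # [0, 5, 3]: one index from each cluster
```

Selecting the top three by reward alone would return indices 0, 1, and 2, all from the same cluster; the greedy rule instead spreads the batch across clusters, which is the kind of effect the abstract attributes to diverse mini-batch selection. In a molecular setting the distance would be some chemical dissimilarity measure, which is left unspecified here.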